
    Convex optimization over intersection of simple sets: improved convergence rate guarantees via an exact penalty approach

    We consider the problem of minimizing a convex function over the intersection of finitely many simple sets, each of which is easy to project onto. This is an important problem arising in various domains such as machine learning. The main difficulty lies in computing the projection of a point onto the intersection of many sets. Existing approaches yield an infeasible point with an iteration complexity of $O(1/\varepsilon^2)$ for nonsmooth problems, with no guarantee on the infeasibility. By reformulating the problem through exact penalty functions, we derive first-order algorithms which not only guarantee that the distance to the intersection is small but also improve the complexity to $O(1/\varepsilon)$, and to $O(1/\sqrt{\varepsilon})$ for smooth functions. For composite and smooth problems, this is achieved through a saddle-point reformulation where the proximal operators required by the primal-dual algorithms can be computed in closed form. We illustrate the benefits of our approach on a graph transduction problem and on graph matching.
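    A minimal sketch of the exact-penalty idea from the abstract, assuming each simple set is available only through its Euclidean projection; the function names, the penalty weight `rho`, and the toy sets are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def exact_penalty_subgradient(grad_f, projections, x0, rho=10.0, steps=500):
    """Minimize f(x) + rho * sum_i dist(x, C_i) using only projections onto each C_i.

    grad_f      : gradient (or a subgradient) of the convex objective f
    projections : list of functions, each projecting a point onto one simple set C_i
    rho         : penalty weight (must exceed a problem-dependent threshold for exactness)
    """
    x = x0.astype(float)
    for t in range(1, steps + 1):
        g = grad_f(x)
        for proj in projections:
            p = proj(x)
            d = np.linalg.norm(x - p)
            if d > 1e-12:                      # subgradient of dist(., C_i) outside C_i
                g = g + rho * (x - p) / d
        x = x - (1.0 / np.sqrt(t)) * g         # diminishing step size
    return x

# Toy usage: minimize ||x - c||^2 over the intersection of a box and a halfspace.
c = np.array([2.0, 2.0])
grad_f = lambda x: 2.0 * (x - c)
proj_box = lambda x: np.clip(x, -1.0, 1.0)                     # projection onto [-1, 1]^2
proj_halfspace = lambda x: x - max(x.sum() - 1.0, 0.0) / 2.0   # projection onto {x1 + x2 <= 1}
x_star = exact_penalty_subgradient(grad_f, [proj_box, proj_halfspace], np.zeros(2))
```

    For the penalty to be exact, `rho` must exceed a problem-dependent constant; the improved rates in the paper come from primal-dual methods with closed-form proximal operators rather than this plain subgradient scheme.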

    How Many Pairwise Preferences Do We Need to Rank A Graph Consistently?

    We consider the problem of optimal recovery of the true ranking of $n$ items from a randomly chosen subset of their pairwise preferences. It is well known that, without any further assumption, a sample size of $\Omega(n^2)$ is required for this purpose. We analyze the problem with the additional structure of a relational graph $G([n],E)$ over the $n$ items, together with an assumption of \emph{locality}: neighboring items are similar in their rankings. Noting the preferential nature of the data, we choose to embed not the graph itself but its \emph{strong product}, so as to capture pairwise node relationships. Furthermore, unlike existing literature that uses Laplacian embeddings for graph-based learning problems, we use a richer class of graph embeddings, \emph{orthonormal representations}, which includes the (normalized) Laplacian as a special case. Our proposed algorithm, \emph{Pref-Rank}, predicts the underlying ranking using an SVM-based approach over the chosen embedding of the product graph, and is the first to provide \emph{statistical consistency} on two ranking losses, \emph{Kendall's tau} and \emph{Spearman's footrule}, with a required sample complexity of $O\big(n^2 \chi(\bar{G})\big)^{\frac{2}{3}}$ pairs, where $\chi(\bar{G})$ is the \emph{chromatic number} of the complement graph $\bar{G}$. Our sample complexity is thus smaller for dense graphs, with $\chi(\bar{G})$ characterizing the degree of node connectivity, which is also intuitive under the locality assumption: for example, $O(n^{\frac{4}{3}})$ for a union of $k$-cliques, or $O(n^{\frac{5}{3}})$ for random and power-law graphs, quantities much smaller than the fundamental limit of $\Omega(n^2)$ for large $n$. This, for the first time, relates ranking complexity to structural properties of the graph. We also report experimental evaluations on different synthetic and real datasets, where our algorithm is shown to outperform the state-of-the-art methods.
    Comment: In the Thirty-Third AAAI Conference on Artificial Intelligence, 2019.
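    A rough sketch of the pipeline the abstract describes: embed the strong product of the item graph with itself and train an SVM on observed preference pairs. The normalized-Laplacian embedding below stands in for the richer class of orthonormal representations, and the names (`pref_rank_sketch`, `strong_product_adjacency`) are hypothetical, not the authors' code.

```python
import numpy as np
from sklearn.svm import SVC

def strong_product_adjacency(A):
    """Adjacency matrix of the strong product G x G of an item graph with itself."""
    S = A + np.eye(A.shape[0])
    return np.kron(S, S) - np.eye(A.shape[0] ** 2)

def laplacian_embedding(P, dim):
    """Spectral embedding from the normalized Laplacian (one member of the
    orthonormal-representation family mentioned in the abstract)."""
    d = P.sum(axis=1)
    d[d == 0] = 1.0
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(P.shape[0]) - D_inv_sqrt @ P @ D_inv_sqrt
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, :dim]                       # smoothest eigenvectors as features

def pref_rank_sketch(A, observed_pairs, labels, dim=10):
    """Train an SVM on embedded product-graph nodes indexed by observed item pairs.

    observed_pairs : list of (i, j) item pairs with an observed preference
    labels         : +1 if item i is preferred over item j, -1 otherwise
    """
    n = A.shape[0]
    X = laplacian_embedding(strong_product_adjacency(A), dim)
    idx = [i * n + j for i, j in observed_pairs]   # product-graph node for pair (i, j)
    clf = SVC(kernel="linear").fit(X[idx], labels)
    return clf, X
```

    A full ranking could then be read off by aggregating the classifier's predicted preferences per item; the paper's consistency guarantees depend on the exact choice of embedding and loss, which this sketch does not reproduce.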

    Second order cone programming approaches for handling missing and uncertain data

    We propose a novel second-order cone programming formulation for designing robust classifiers that can handle uncertainty in observations. Similar formulations are also derived for designing regression functions that are robust to uncertainties in the regression setting. The proposed formulations are independent of the underlying distribution, requiring only the existence of second-order moments. These formulations are then specialized to the case of missing values in observations for both classification and regression problems. Experiments show that the proposed formulations outperform imputation.
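    A hedged sketch of a moment-based robust classifier of this flavor, written with cvxpy (the solver choice, the multiplier formula, and the helper name are assumptions, not the paper's code): each training point is known only through its mean and covariance, and a Chebyshev-style bound turns the probabilistic margin requirement into a second-order cone constraint.

```python
import numpy as np
import cvxpy as cp

def robust_socp_classifier(X, y, Sigmas, eta=0.9, C=1.0):
    """Robust soft-margin classifier: example i is known only through mean X[i] and
    covariance Sigmas[i]; the margin must hold with probability >= eta for any
    distribution with those first two moments (second-order cone constraints)."""
    n, d = X.shape
    kappa = np.sqrt(eta / (1.0 - eta))        # multiplier from the Chebyshev bound
    w, b = cp.Variable(d), cp.Variable()
    xi = cp.Variable(n, nonneg=True)
    cons = []
    for i in range(n):
        S_half = np.linalg.cholesky(Sigmas[i] + 1e-9 * np.eye(d))
        cons.append(y[i] * (X[i] @ w + b) >= 1 - xi[i] + kappa * cp.norm(S_half.T @ w, 2))
    prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w) + C * cp.sum(xi)), cons)
    prob.solve()
    return w.value, b.value
```

    Setting all covariances to zero recovers the standard soft-margin SVM, which is one way to sanity-check the formulation.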

    Random Separating Hyperplane Theorem and Learning Polytopes

    The Separating Hyperplane Theorem is a fundamental result in convex geometry with myriad applications. Our first result, the Random Separating Hyperplane Theorem (RSH), is a strengthening of this theorem for polytopes. RSH asserts that if the distance between a point $a$ and a polytope $K$ with $k$ vertices and unit diameter in $\Re^d$ is at least $\delta$, where $\delta$ is a fixed constant in $(0,1)$, then a randomly chosen hyperplane separates $a$ and $K$ with probability at least $1/\mathrm{poly}(k)$ and margin at least $\Omega\left(\delta/\sqrt{d}\right)$. An immediate consequence of our result is the first near-optimal bound on the error increase in the reduction from a separation oracle to an optimization oracle over a polytope. RSH has algorithmic applications in learning polytopes. We consider a fundamental problem, denoted the ``Hausdorff problem'', of learning a unit-diameter polytope $K$ within Hausdorff distance $\delta$, given an optimization oracle for $K$. Using RSH, we show that with polynomially many random queries to the optimization oracle, $K$ can be approximated within error $O(\delta)$. To our knowledge this is the first provable algorithm for the Hausdorff problem. Building on this result, we show that if the vertices of $K$ are well separated, then an optimization oracle can be used to generate a list of points, each within Hausdorff distance $O(\delta)$ of $K$, with the property that the list contains a point close to each vertex of $K$. Further, we show how to prune this list to generate a (unique) approximation to each vertex of the polytope. We prove that in many latent variable settings, e.g., topic modeling and LDA, optimization oracles do exist, provided we project to a suitable SVD subspace. Thus, our work yields the first efficient algorithm for finding approximations to the vertices of the latent polytope under the well-separatedness assumption.
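    A toy sketch of the two ingredients in the abstract, with hypothetical names and a simplex standing in for the latent polytope: an optimization oracle queried at random directions yields support points approximating $K$, and a random-direction test illustrates the separating-hyperplane event that RSH bounds. The pruning and vertex-recovery steps are omitted.

```python
import numpy as np

def random_unit_vector(d, rng):
    v = rng.standard_normal(d)
    return v / np.linalg.norm(v)

def approximate_polytope(opt_oracle, d, num_queries, rng=None):
    """Query the optimization oracle at random directions; the returned support points
    give an inner approximation of K, in the spirit of the Hausdorff-problem algorithm."""
    rng = rng or np.random.default_rng(0)
    return np.array([opt_oracle(random_unit_vector(d, rng)) for _ in range(num_queries)])

def random_hyperplane_separates(a, vertices, trials, margin, rng=None):
    """Empirical version of the RSH event: does some random direction put a on the far
    side of all vertices of K by at least the given margin?"""
    rng = rng or np.random.default_rng(0)
    for _ in range(trials):
        u = random_unit_vector(a.shape[0], rng)
        if u @ a >= np.max(vertices @ u) + margin:     # a is separated with this margin
            return True
    return False

# Toy usage: K is the unit simplex in R^3; the oracle maximizes a linear function over K.
simplex = np.eye(3)
oracle = lambda u: simplex[np.argmax(simplex @ u)]
pts = approximate_polytope(oracle, d=3, num_queries=200)
print(random_hyperplane_separates(np.array([1.0, 1.0, 1.0]), simplex, trials=1000, margin=0.3))
```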